Conversation

@AnimeshAgarwal28 AnimeshAgarwal28 commented Oct 2, 2025

This PR implements GPR definition in the Unified Database, addressing issue #1085.

Problem

The UDB currently lacks information about General Purpose Registers in YAML format.

Solution

This PR adds structured register file support to UDB, starting with RISC-V general purpose registers as a foundation for future register file additions.

Changes

New Files

  • spec/schemas/register_schema.json: JSON schema defining the structure of YAML register files.
  • spec/std/isa/register/gpr.yaml: Complete RISC-V general purpose register file with all 32 registers, proper ABI mnemonics, calling convention roles, and conditional support for RV32E (16 registers).
  • tools/ruby-gems/udb/lib/udb/obj/register_file.rb: RegisterFile class extending TopLevelDatabaseObject for programmatic access to register file data

Modified Files

  • spec/schemas/schema_defs.json: Added register-specific schema definitions
  • spec/std/isa/README.adoc: Updated architecture documentation to include register files alongside extensions/instructions/CSRs with usage examples
  • tools/ruby-gems/udb/lib/udb/obj/database_obj.rb: Added RegisterFile kind to DatabaseObject::Kind enum for type system integration
  • tools/ruby-gems/udb/lib/udb/architecture.rb: Added register file loading support to architecture system

Register Details

The GPR implementation includes:

  • All 32 general purpose registers (x0-x31) with standard names
  • Proper ABI mnemonics (zero, ra, sp, gp, tp, t0-t6, s0-s11, a0-a7)
  • Calling convention classifications (caller/callee saved, arguments, return values)
  • Special register roles (zero register, stack pointer, frame pointer, etc.)
  • Conditional presence for RV32E embedded profile

Benefits

  • Single Source of Truth: Eliminates need for hardcoded register mappings in downstream tools
  • Consistency: Ensures all tools use identical register information
  • Automation Ready: Structured format enables automatic code generation
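
As a rough illustration of the "Automation Ready" point, a downstream tool could derive its name-to-index table from the YAML instead of hardcoding it. The sketch below is not code from this PR. The inline YAML is a three-register excerpt, the field names `name` and `abi_mnemonics` follow the snippet quoted later in this thread, and the `registers` key and the rule that an index is inferred from list position are assumptions.

```ruby
require "yaml"

# Hypothetical excerpt of a register file definition; the real file may differ.
GPR_YAML = <<~YAML
  name: X
  registers:
    - name: x0
      abi_mnemonics: [zero]
    - name: x1
      abi_mnemonics: [ra]
    - name: x2
      abi_mnemonics: [sp]
YAML

# Map every accepted spelling (x-name or ABI mnemonic) to the register index.
# The index is assumed to be the entry's position in the list.
def register_index_map(register_file)
  register_file.fetch("registers").each_with_index.with_object({}) do |(reg, idx), map|
    map[reg.fetch("name")] = idx
    reg.fetch("abi_mnemonics", []).each { |mnemonic| map[mnemonic] = idx }
  end
end

INDEX_BY_NAME = register_index_map(YAML.safe_load(GPR_YAML))
puts INDEX_BY_NAME["sp"]
```

An assembler or disassembler table built this way would pick up new mnemonics from the database without source changes.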

Future Work

This establishes the foundation for adding other register files mentioned in issue #1085:

  • Floating Point Registers
  • Vector Registers

Closes #1085

Implements General Purpose Register (GPR) support in the Unified
Database as requested in issue riscv-software-src#1085.

Changes:
- spec/schemas/register_schema.json: New schema defining structure for
  register files
- spec/schemas/schema_defs.json: Add register-specific schema definition
- spec/std/isa/register/gpr.yaml: New register file implementing all 32
  RISC-V general purpose registers with proper ABI mnemonics, calling
  convention roles, and conditional support for RV32E
- spec/std/isa/README.adoc: Update architecture documentation to include
  register files
- tools/ruby-gems/udb/lib/udb/obj/register_file.rb: New RegisterFile
  class extending TopLevelDatabaseObject
- tools/ruby-gems/udb/lib/udb/obj/database_obj.rb: Add RegisterFile kind
  to DatabaseObject::Kind enum for type system integration
- tools/ruby-gems/udb/lib/udb/architecture.rb: Add register file loading
  support to architecture

The implementation follows RISC-V ABI specifications and provides a
single source of truth for GPR information.

Resolves Issue: riscv-software-src#1085
@AnimeshAgarwal28 AnimeshAgarwal28 changed the title from "Add GPR Information to UDB" to "feat: Add GPR Information to UDB" Oct 2, 2025
@AnimeshAgarwal28 AnimeshAgarwal28 changed the title from "feat: Add GPR Information to UDB" to "feat: add GPR Information to UDB" Oct 2, 2025
Collaborator

@ThinkOpenly ThinkOpenly left a comment

This is, frankly, pretty awesome! Great work, @AnimeshAgarwal28 !

There are a few comments that need discussion, but this is an impressive start!

Collaborator

@dhower-qc dhower-qc left a comment

Excellent first draft!

In addition to the inline comments, a few high-level thoughts:

To really get this integrated, we'll want to reflect it in a few places where the X registers are built-in concepts. A few that come to mind: the IDL symbol table and the generated ISS.

In the symbol table, you'll find this:

@scopes = [{
  "X" => Var.new(
    "X",
    Type.new(:array, sub_type: XregType.new(@mxlen.nil? ? 64 : @mxlen), width: 32, qualifiers: [:global])
  ),
  "XReg" => XregType.new(@mxlen.nil? ? 64 : @mxlen),

That's where "X" gets added in global scope as an array of Bits. Instead of that, we'll want to construct (all of) the register files based on the YAML. Hopefully you can get some ideas of how to do so by looking around here, where the symbol table is instantiated:

@symtab =
  Idl::SymbolTable.new(
    mxlen:,
    possible_xlens:,
    params:,
    builtin_funcs: symtab_callbacks,
    builtin_enums: [
      Idl::SymbolTable::EnumDef.new(
        name: "ExtensionName",
        element_values: (1..extensions.size).to_a,
        element_names: extensions.map(&:name)
      ),
      Idl::SymbolTable::EnumDef.new(
        name: "ExceptionCode",
        element_values: exception_codes.map(&:num),
        element_names: exception_codes.map(&:var)
      ),
      Idl::SymbolTable::EnumDef.new(
        name: "InterruptCode",
        element_values: interrupt_codes.map(&:num),
        element_names: interrupt_codes.map(&:var)
      )
    ],
    name: @name,
    csrs:
  )
overlay_path = config.info.overlay_path
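
One possible shape for the data-driven version, as a sketch only: `RegisterFileDef`, `GlobalVar`, and `build_register_scope` are hypothetical stand-ins and not the real `Var`/`Type`/`XregType` API. The point is just that the hardcoded "X" entry becomes one iteration of a loop over whatever register files the database defines.

```ruby
# Stand-ins for illustration only; the real IDL symbol table uses Var/Type/XregType.
RegisterFileDef = Struct.new(:name, :num_registers, :width, keyword_init: true)
GlobalVar = Struct.new(:name, :kind, :element_width, :num_elements, :qualifiers,
                       keyword_init: true)

# Create one global array variable per register file instead of hardcoding "X".
# A register file without its own width uses mxlen, and a nil mxlen falls back
# to 64, mirroring the existing `@mxlen.nil? ? 64 : @mxlen` logic.
def build_register_scope(register_files, mxlen)
  register_files.to_h do |rf|
    width = rf.width || mxlen || 64
    var = GlobalVar.new(name: rf.name, kind: :array, element_width: width,
                        num_elements: rf.num_registers, qualifiers: [:global])
    [rf.name, var]
  end
end

scope = build_register_scope(
  [RegisterFileDef.new(name: "X", num_registers: 32, width: nil)],
  nil
)
puts scope["X"].element_width
```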

In the ISS, the X registers are defined on a "hart" object, as you can see here:

uint64_t xreg(unsigned num) const override {
  if (num >= 32) {
    throw std::out_of_range("X register indices are 0 - 31, inclusive");
  }
  return _xreg(num).get();
}
PossiblyUnknownBits<MXLEN> _xreg(unsigned num) const {
  return m_xregs[num];
}
template <template <unsigned, bool> class BitsClass, unsigned N, bool Signed>
  requires (BitsType<BitsClass<N, Signed>>)
PossiblyUnknownBits<MXLEN> _xreg(const BitsClass<N, Signed>& num) const {
  return m_xregs[num.get()];
}
// XRegister<MXLEN>& xregRef(unsigned num) { return m_xregs[num]; }
void set_xreg(unsigned num, uint64_t value) override {
  if (num >= 32) {
    throw std::out_of_range("X register indices are 0 - 31, inclusive");
  }
  _set_xreg(Bits<8>{num}, Bits<MXLEN>{value});
}
template <
  template <unsigned, bool> class IdxType, unsigned IdxN, bool IdxSigned,
  template<unsigned, bool> class ValueType, unsigned ValueN, bool ValueSigned
>
  requires (BitsType<IdxType<IdxN, IdxSigned>> && BitsType<ValueType<ValueN, ValueSigned>>)
void _set_xreg(const IdxType<IdxN, IdxSigned>& num, const ValueType<ValueN, ValueSigned>& value) {
  if (num != 0_b) {
    m_xregs[static_cast<unsigned>(num.get())] = value;
  }
}

<%- if cfg_arch.mxlen.nil? -%>
  std::array<PossiblyUnknownRuntimeBits<64>, 32> m_xregs;
<%- else -%>
  std::array<PossiblyUnknownBits<MXLEN>, 32> m_xregs;
<%- end -%>

We'll want to think this one through a bit more with @henrikg-qc
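
For discussion, the `num != 0_b` guard in `_set_xreg` is what hardwires x0 to zero. If the register file data carried a per-register writability flag, that behavior could come from data rather than from a special case. A minimal Ruby model of the idea follows; `SimpleRegisterFile` and its list of flags are hypothetical and only mimic the behavior, they are not part of the ISS.

```ruby
# Hypothetical model of a register file whose read-only entries ignore writes.
class SimpleRegisterFile
  def initialize(writable_flags)
    @writable = writable_flags
    @values = Array.new(writable_flags.size, 0)
  end

  def read(num)
    check_index(num)
    @values[num]
  end

  # Writes to a non-writable register (such as x0) are silently dropped.
  def write(num, value)
    check_index(num)
    @values[num] = value if @writable[num]
  end

  private

  def check_index(num)
    return if num.is_a?(Integer) && num >= 0 && num < @values.size

    raise IndexError, "register indices are 0 - #{@values.size - 1}, inclusive"
  end
end

# x0 is read-only; x1 through x31 are writable.
xregs = SimpleRegisterFile.new([false] + [true] * 31)
xregs.write(0, 0xdead)
xregs.write(5, 0xbeef)
puts xregs.read(0), xregs.read(5)
```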

Comment on lines +111 to +112
when:
  not: { name: E }
Collaborator

After #891, this would be:

  when:
    not:
      extension:
        name: E
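
To make the semantics of the nested form concrete, here is a small sketch of how such a condition might be evaluated against a list of implemented extensions. The `condition_met?` helper is hypothetical and handles only the `not` and `extension: name:` shapes shown above; it is not UDB's actual condition evaluator.

```ruby
require "yaml"

# Hypothetical evaluator covering only `not:` and `extension: { name: ... }`.
def condition_met?(cond, implemented_extensions)
  if cond.key?("not")
    !condition_met?(cond["not"], implemented_extensions)
  elsif cond.key?("extension")
    implemented_extensions.include?(cond["extension"].fetch("name"))
  else
    raise ArgumentError, "unsupported condition: #{cond.inspect}"
  end
end

entry = YAML.safe_load(<<~YAML)
  when:
    not:
      extension:
        name: E
YAML

puts condition_met?(entry["when"], %w[I M])
puts condition_met?(entry["when"], %w[E])
```

Under this reading, a register entry guarded by the condition is present when E is not implemented and absent when it is.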

@dhower-qc dhower-qc requested a review from henrikg-qc October 3, 2025 14:42
Changes:
- rename schema from register_schema.json to register_file_schema.json,
  remove "$ref": "#/$defs/register_file" from bottom of the schema,
  and introduce register_file_name.
- drop per-register length and index fields, make abi mnemonics an
  array, and add 'caller_saved'/'callee_saved' booleans with default
  value of 'false'.
- remove the 'count' helper in favor of conditioning individual
  register entries directly and infer indices from position.
- fix the description for XLEN behavior and remove empty role arrays.
- update the register-file section of README.adoc so the example
  mirrors the new schema.
- register_file.rb: expose register data via '#data' and return
  Sorbet enum values for roles.

Signed-off-by: Animesh Agarwal <animeshagarwal28@gmail.com>
},
"register_file_name": {
  "type": "string",
  "pattern": "^[A-Za-z][A-Za-z0-9_.-]*$",
Collaborator

Let's restrict this to a single character. Since we already have X and f, either case:

Suggested change
- "pattern": "^[A-Za-z][A-Za-z0-9_.-]*$",
+ "pattern": "^[A-Za-z]$",

Collaborator

Agree. And perhaps to take it a step further: make them all single letter uppercase? Right now, we have the unfortunate situation that X is uppercase but f and v are lowercase. That's out of necessity since we declare f and v as generic globals right now, and uppercase variable names are const by definition in IDL. X is special since it's builtin.

By having dedicated register file definitions, we can avoid the discrepancy.

Collaborator

So, further constrain this pattern to single-character upper case, and we'll need to change the current f and v register files to F and V in the YAML and in the Ruby code?
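
For reference, the three candidate patterns can be compared quickly in Ruby. Ruby treats `^` and `$` as line anchors, so this sketch uses `\A` and `\z` to approximate the whole-string match intended by the schema pattern. The constant names are made up for this example, and the uppercase-only pattern is one reading of the proposal above, not something already in the schema.

```ruby
ORIGINAL_PATTERN = /\A[A-Za-z][A-Za-z0-9_.-]*\z/ # pattern under review
SINGLE_LETTER    = /\A[A-Za-z]\z/                # suggested change
SINGLE_UPPERCASE = /\A[A-Z]\z/                   # possible further restriction

PATTERNS = [ORIGINAL_PATTERN, SINGLE_LETTER, SINGLE_UPPERCASE].freeze

# For each candidate name, report which patterns accept it, in the order above.
%w[X f V gpr].each do |name|
  accepted = PATTERNS.map { |re| re.match?(name) }
  puts "#{name}: #{accepted.inspect}"
end
```

With the uppercase-only pattern, the existing lowercase `f` and `v` names would be rejected until they are renamed.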

@ThinkOpenly
Collaborator

Interesting... the CI is complaining:

ERROR: 'name' key (X) must match filename (gpr) in /home/runner/work/riscv-unified-db/riscv-unified-db/gen/spec/_/register/gpr.yaml

I'm not sure exactly who's complaining about that, but I guess we should rename the file to X.yaml.

…tern

Changes:
- Add 'writable' boolean property to register schema to indicate if a register is writable.
- Update gpr.yaml to X.yaml, register directory to register_file.
- Restrict regex for register_file_name to a single character.
- Update README.adoc.

Signed-off-by: Animesh Agarwal <animeshagarwal28@gmail.com>

codecov bot commented Oct 29, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 46.05%. Comparing base (5d6c397) to head (f1b0a8d).
⚠️ Report is 6 commits behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #1150   +/-   ##
=======================================
  Coverage   46.05%   46.05%           
=======================================
  Files          11       11           
  Lines        4942     4942           
  Branches     1345     1345           
=======================================
  Hits         2276     2276           
  Misses       2666     2666           
Flag Coverage Δ
idlc 46.05% <ø> (ø)

@ThinkOpenly
Collaborator

CI detected a couple of lines with trailing whitespace.

@dhower-qc
Collaborator

> Interesting... the CI is complaining:
>
> ERROR: 'name' key (X) must match filename (gpr) in /home/runner/work/riscv-unified-db/riscv-unified-db/gen/spec/_/register/gpr.yaml
>
> I'm not sure exactly who's complaining about that, but I guess we should rename the file to X.yaml.

We enforce the invariant that any database file must match the corresponding name:, which makes it easier to find the raw data.
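
A minimal sketch of what such a check amounts to, assuming the rule is simply "the `name:` key equals the file's basename without the `.yaml` extension". The helper `name_matches_filename?` is hypothetical and is not the validator that produced the CI error.

```ruby
require "yaml"
require "tmpdir"

# Hypothetical check: the `name:` key must equal the basename without ".yaml".
def name_matches_filename?(path)
  data = YAML.safe_load(File.read(path))
  data["name"] == File.basename(path, ".yaml")
end

Dir.mktmpdir do |dir|
  mismatched = File.join(dir, "gpr.yaml")
  matching = File.join(dir, "X.yaml")
  [mismatched, matching].each { |path| File.write(path, "name: X\n") }

  puts name_matches_filename?(mismatched)
  puts name_matches_filename?(matching)
end
```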

- name: x5
  abi_mnemonics: [t0]
  caller_saved: true
  roles: [temporary]
Collaborator

Isn't x5 sometimes a return address?

Collaborator

The only reference I can find relates to the use of x5 with shadow stack operations. Is that what you are referring to?

Collaborator

No. X5 is also called the "alternate link register", and is used for calling some functions where they don't want to trash ra (millicode, and outlining).

Collaborator

I guess part of the reason this is an alternate link register is because it is a temporary register, and functions that use x5 this way know it when they are constructed.

X5 does have specific support in Zicfiss for being used as a return address.

Collaborator

> X5 is also called the "alternate link register"

Called by whom? I can't find it in the RISC-V ABIs Specification (Version 1.0: Ratified), nor in the RISC-V Assembly Programmer's Manual (2025-02-05).

I'm just wondering if this is an ad-hoc classification, since it doesn't seem to be officially documented as anything other than a temporary register.

Development

Successfully merging this pull request may close these issues.

Add GPR Information to UDB

4 participants